POLYNUCLEOTIDE SEQUENCING METHOD AND KITS THEREFOR

The present invention relates to a method of sequencing
polynucleotides using restriction enzymes
which cleave the polynucleotide at a site away from the
restriction enzyme's recognition site and to kits for use
with such a method.

With the advent of genetic engineering it has become
possible to isolate polynucleotide fragments and to 
determine their nucleotide sequence. Typically the 
polynucleotide fragment of interest is first amplified in 
order to generate enough sequencing template, prior to 
determining its polynucleotide sequence. This may be 
achieved, for example using polymerase chain reaction (PCR) 
techniques or polynucleotide cloning methodologies. 

However, it is generally difficult to sequence large
polynucleotide (e.g. DNA) fragments (i.e. greater than about
500bp-1kbp), due to the limitations of sequencing
methodologies. It is therefore often desirable to cleave
large fragments into more manageable smaller fragments and
to sequence these smaller fragments. The sequences
determined can then be reassembled into a single
polynucleotide sequence.

One technique for obtaining smaller fragments is known
as shotgun cloning. Typically, a large DNA fragment is
completely digested, using a frequent cutting restriction
enzyme such as Sau3AI, into much smaller fragments. A
vector, for example a plasmid, is digested with a rarer
cutting enzyme (e.g. BamHI), so that the vector is cut only
once and so as to give ends complementary to those
generated by the frequent cutting enzyme. The small Sau3AI
digested polynucleotide fragments are then cloned into the
vector to allow sequencing.

However, such a strategy is not attractive because the
ends of the DNA fragments produced by digestion are
identical and so it is not easy to reassemble them into the
order in which they occur in the large fragment without
resorting to some form of restriction mapping.
Additionally, it is possible to fail to identify colonies
containing vectors with very small inserts since such
colonies can appear blue using conventional blue/white
selection. Unless an accurate restriction map has been
determined, it is possible to fail to identify that such
small inserts of sequence are missing from the whole
sequence and consequently to ascertain the polynucleotide
sequence of the larger fragment incorrectly.

Thus, it is generally necessary to perform further 
sequencing experiments in order to confirm the restriction 
sites and ensure that all fragments have been cloned and 
sequenced. It will be appreciated that this process can be
very time consuming and expensive to perform.

An alternative is to carry out a partial digest, again
using a restriction enzyme such as Sau3AI. The partial
digest is intended to generate a series of overlapping 
clones which can be sequenced and the matching sequences 
aligned so as to form a contiguous overlapping sequence. 




WO 99/43845                                   PCT/GB99/00539



However, the conditions for carrying out the partial 
digestion have to be carefully controlled in order to 
prevent complete digestion, the control of which can be 
difficult to achieve. Moreover, a significant amount of
overlapping sequence may be generated, which may lead to
some sections of the DNA being sequenced unnecessarily,
which again wastes time and resources.

Another system for sequencing large fragments of DNA
is based on the procedure developed by Henikoff (Henikoff,
S. (1984) Gene 28, 351), in which exonuclease III (ExoIII)
is used to specifically digest DNA from a 5' protruding or
blunt-end restriction site. The other end of the DNA is
protected from digestion by ExoIII by a 4-base 3' overhang
restriction site or by an alpha-phosphorothioate filled
end.

Typically ExoIII is added to a sample of linearised
vector containing insert DNA and digestion started.
Samples of the ExoIII digestion are removed at timed
intervals and added to tubes containing S1 nuclease, which
removes the remaining single-stranded tails. The ends are
blunt-ended and ligated to re-circularise the now
deletion-containing vectors.

The generation of ordered sets of deletions by this
method relies on the uniform digestion rate of ExoIII.
However, ExoIII will also digest from nicks in
double-stranded DNA. It is therefore important to minimise the
proportion of nicked molecules in the starting DNA, by
purifying the DNA using special techniques.

Moreover, the ExoIII process is generally only
suitable if the restriction enzyme sites which linearise
the vector are not present in the insert, the probability
of which decreases with increasing insert size.
Furthermore the ExoIII process only results in DNA which
decreases in size from one end, since the other end is not
digested. Thus, subsequent sequencing only generates new
sequence from one end.

There is thus a need for a more efficient and easier
process which will allow large fragments of polynucleotides
(e.g. DNA) to be sequenced.

It is therefore among the objects of the present 
invention to obviate and/or mitigate at least one of the 
above described disadvantages. 

The present invention provides a method of determining 
the nucleotide sequence of a polynucleotide, comprising the 
steps of: 

a) cleaving the polynucleotide with a restriction enzyme so
as to generate two or more fragments, wherein the
restriction enzyme cleaves the polynucleotide at a site
away from the restriction enzyme's recognition site so as
to generate a cleaved site possessing a recessed 3'-end and
a 5'-overhang of undefined sequence;

b) filling-in said recessed 3'-ends so as to form
substantially blunt-ended fragments;

c) cloning and sequencing said blunt-ended fragments;

d) pairing matching blunt-ends of said blunt-ended
fragments so as to allow said blunt-ended fragments to be
ordered in a contiguous over-lapping arrangement; and

e) reading said nucleotide sequence from said contiguous
arrangement.

It is to be understood that the substantially
blunt-ended fragments referred to above include fragments with
true or perfect blunt-ends (i.e. blunt-ends which do not
possess any overhang) as well as fragments which possess
ends with a single-base overhang.

The polynucleotide to be sequenced is generally
isolated from the genome with which it is associated and
optionally amplified, for example by PCR or cloning into a
vector and amplifying the vector in a suitable host.
Typically the polynucleotide may be greater than 1kb in
length, for example greater than 10kb or greater than
50-100kb in length.

In theory the polynucleotide may be of any length. 
The suitability of said polynucleotide for sequencing will 
generally depend on the number and length of restriction 
fragments which are generated by cleavage with the 
restriction enzyme. 

Although the restriction enzymes cleave
double-stranded DNA, the polynucleotide need not initially be
double-stranded DNA. The polynucleotide can for example be
single-stranded RNA which is converted to double-stranded
cDNA by use of reverse transcriptase and DNA polymerase as
is well known in the art.

The polynucleotide may be from any desired source. For
example, the polynucleotide may be obtained from bacteria,
plants, insects, viruses and animals.

The restriction enzymes suitable for use in the
present invention specifically generate 5'-overhangs of
undefined sequence. The restriction enzyme identifies a
constant defined recognition site and cleaves the DNA
within an adjacent undefined region which may consist of
any sequence. An example of such an enzyme is HgaI.

HgaI recognises the following recognition site with
the recognition sequence shown underlined:

                ↓
   5' GACGCNNNNN NNNNN NNNNN 3'
      -----
   3' CTGCGNNNNN NNNNN NNNNN 5'
                      ↑

where N represents any nucleotide base (i.e. A, C, G or T)
and the arrows show the points of cleavage.

Thus, HgaI cleavage at this site generates two ends
which both possess recessed 3'-ends and 5'-overhangs of
undefined sequence, one of which is:

   5' GACGCNNNNN 3'
   3' CTGCGNNNNNNNNNN 5'

By convention recognition sequences are often
represented by one strand only, written from 5' to 3'. For
enzymes such as HgaI, which cleave away from their
recognition sequence, the sites of cleavage are indicated
by their position, or in parentheses. Thus, the
recognition sequence of HgaI is often represented as:

GACGC(N)5/10, or
GACGC(5/10)

which means that the enzyme recognises the sequence GACGC
and cleaves the DNA within an adjacent region of any
sequence, 5 bases away from the end "C" of the recognition
sequence on the same strand and 10 bases away on the other
strand.
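
The offset notation can be made concrete with a short sketch. The Python below is an illustration only and not part of the described method: the function names `revcomp` and `overhang_intervals` and the example sequence are hypothetical. Assuming the GACGC(5/10) geometry described above, it locates each recognition site on either strand and reports the stretch of the top strand that would become the 5'-overhang.

```python
import re

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def overhang_intervals(seq, site="GACGC", near_cut=5, far_cut=10):
    """Return sorted (start, end) top-strand intervals that become 5'-overhangs.

    near_cut and far_cut are the two offsets of the SITE(near/far) notation,
    counted from the 3' end of the recognition sequence.
    """
    seq = seq.upper()
    intervals = []
    # Site read on the top strand: both cuts fall downstream of the site.
    for m in re.finditer(f"(?={site})", seq):
        site_end = m.start() + len(site)
        start, end = site_end + near_cut, site_end + far_cut
        if end <= len(seq):
            intervals.append((start, end))
    # Site read on the bottom strand: both cuts fall upstream of the site.
    for m in re.finditer(f"(?={revcomp(site)})", seq):
        start, end = m.start() - far_cut, m.start() - near_cut
        if start >= 0:
            intervals.append((start, end))
    return sorted(intervals)

# Hypothetical example: one site, followed by 5 spacer bases and the overhang.
example = "AAAA" + "GACGC" + "TTTTT" + "ACGTA" + "CCCC"
for start, end in overhang_intervals(example):
    print(start, end, example[start:end])  # prints: 14 19 ACGTA
```

The same function can be tried with other offsets from the list below, for example `near_cut=1, far_cut=5` for a GTCTC(N)1/5 enzyme.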

Examples of other restriction enzymes suitable for use 
in the present invention and their recognition sites are as 
follows: 



Alw26I                      GTCTC(N)1/5
BbvI                        GCAGC(N)8/12
[illegible]                 GTCTC(N)1/5
BsmFI                       GTCCC(N)10/14
Bst71I                      GCAGC(N)8/12
FokI                        GGATG(N)9/13
[illegible]                 GCATC(N)5/9
Eam1104I/EarI/Ksp632I       CTCTTC(N)1/4
BbsI/Bbv16II/BpiI/BpuAI     GAAGAC(N)2/6
BsaI/Eco31I                 GGTCTC(N)1/5
BsmBI/Esp3I                 CGTCTC(N)1/5
BspMI                       ACCTGC(N)4/8
GdiII                       CGGCC(A/G)(N)1/5
SapI                        GCTCTTC(N)1/4

Any restriction enzyme which generates a 5'-overhang
of undefined sequence may be used in the present invention.
However, it is preferable that the overhang be 3 or more
bases in length in order to minimise the probability of a
chance overlap match, as will be explained in detail below.

Typically, the recessed 3'-ends are filled in by
employing a DNA polymerase and a mixture of
deoxynucleotide triphosphates (dNTPs), i.e. a mixture containing
dATP, dCTP, dGTP and dTTP, so as to generate substantially
blunt-ends. DNA polymerases possess the ability to add
nucleotides onto an available 3'-OH group of a
polynucleotide chain, but cannot add bases to the
5'-phosphate group.

The skilled addressee is aware that DNA polymerases
that have a "proofreading" function, such as DNA polymerase
I, Pfu and Tli, exhibit 3' to 5' exonuclease activity and
produce greater than 95% blunt-ended fragments. However,
certain thermostable polymerases including Taq, Tfl and Tth
polymerase add a single nucleotide, preferentially adenine,
to the 3'-end, so as to form a blunt-end possessing a
single additional base overhang. This single
nucleotide overhang can nevertheless be used to assist with
the cloning of the DNA, since perfectly blunt-ended
fragments can be more difficult to clone.

The substantially blunt-ended fragments (i.e. perfectly
blunt-ended fragments or blunt-ended fragments possessing
a single base overhang) are cloned into an appropriately
digested vector, such as a plasmid, phagemid or phage
cloning vector. Typically the blunt-ended fragments are
cloned into a so-called polycloning region of such vectors
which possesses a number of unique restriction enzyme
sites.

The polycloning region may for example be digested
with a restriction enzyme which generates blunt ends, such
as SmaI or HincII, or alternatively digested with any
restriction enzyme which generates a 5'-overhang, since
this may also be filled-in by a filling-in reaction, to
allow cloning of the substantially blunt-ended fragments.

Blunt-ended fragments which possess a single adenine
overhang may be cloned into so-called "T-tailed vectors",
or "TA cloning vectors" such as the pGEM®-T vector systems
available from Promega, Southampton, UK, using techniques
previously described in the art (see for example Clark, J.
(1988) Nucleic Acids Research 16, 9677-9686).

Once the blunt-ended fragments have been cloned their
nucleotide sequence may be determined using conventional
DNA sequencing methods well known in the art. In
particular, the sequence of the previously undefined
5'-overhang region of the cleavage site, which was blunt-ended
by the filling-in process, is determined.
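
The effect of cleavage followed by filling-in can be sketched in a few lines. The Python below is a hypothetical illustration, not the procedure used in the Examples: `filled_in_fragments` and the example data are invented names, the overhang positions are supplied by the caller as (start, end) intervals on the top strand, and the optional single adenine added by Taq-type polymerases is ignored. It shows that each filled-in fragment keeps a full copy of the overhang, so neighbouring fragments end and begin with the same short sequence.

```python
def filled_in_fragments(seq, overhangs):
    """Return the blunt-ended fragments obtained after cleavage and filling-in.

    overhangs: sorted, non-overlapping (start, end) intervals of seq, each one
    being the single-stranded 5'-overhang left by one cleavage event.
    """
    fragments = []
    left = 0
    for start, end in overhangs:
        # Filling-in extends the recessed 3'-end through the whole overhang,
        # so the left-hand fragment runs up to `end` ...
        fragments.append(seq[left:end])
        # ... and the right-hand fragment already begins at `start`.
        left = start
    fragments.append(seq[left:])
    return fragments

# Hypothetical example: a single cleavage leaving the overhang at 14..19.
sequence = "AAAAGACGCTTTTTACGTACCCC"
pieces = filled_in_fragments(sequence, [(14, 19)])
print(pieces)  # ['AAAAGACGCTTTTTACGTA', 'ACGTACCCC']
```

Both pieces carry ACGTA at the junction, which is the property that later allows matching ends to be paired.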

Since a single cleavage reaction generates two
identically complementary 5'-overhangs, albeit of initially
undefined sequence, sequencing of individual clones helps
identify which fragment ends were generated by a particular
cleavage reaction. This is made possible by the nature
of the restriction enzymes used, which generate variable
5'-overhangs.

The chance of two 5'-overhangs, generated by separate
cleavage reactions at different points in the
polynucleotide sequence, being accidentally the same is
calculated as 1 in 4 (which is the number of possible bases,
i.e. A, C, G or T) raised to the power of the length of the
5'-overhang. Thus for a restriction enzyme which generates a
3-base 5'-overhang of undefined sequence, the chance of
any two separate 5'-overhangs being the same is 1:4^3 or
1:64. For a restriction enzyme which generates a 5-base
5'-overhang, the chance of any two separate 5'-overhangs
being the same is 1:1024. Therefore, provided that
relatively few fragments are generated by a particular
restriction enzyme, in comparison with the probability of
a chance match between any two separate 5'-overhangs, it is
possible to pair matching ends with a high degree of
certainty that they were generated from the same cleavage
reaction at a given point in the polynucleotide sequence.
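
The arithmetic above is easy to check numerically. The sketch below is illustrative only; the function names are hypothetical, and the second function goes slightly beyond the text by estimating, under the assumption of independent and uniformly random overhang sequences, how many coincidental pairs to expect when several cleavage sites are present.

```python
from math import comb

def overhang_variants(length):
    """Number of distinct overhang sequences of the given length (4 bases)."""
    return 4 ** length

def expected_chance_pairs(n_sites, length):
    """Expected number of pairs of unrelated cleavage sites whose overhangs
    coincide, assuming independent, uniformly random overhang sequences."""
    return comb(n_sites, 2) / overhang_variants(length)

print(overhang_variants(3))          # 64, i.e. a 1:64 chance for a 3-base overhang
print(overhang_variants(5))          # 1024, i.e. 1:1024 for a 5-base overhang
print(expected_chance_pairs(4, 5))   # 6 pairs of sites / 1024 variants
```

With few sites and a 5-base overhang the expected number of coincidental pairs stays far below one, which is the condition the paragraph above relies on.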

In this manner it is possible to identify all matching
ends by their sequence. The matching ends can then be
paired and the fragments ordered so as to allow a
contiguous over-lapping arrangement of sequences to be
generated, from which the nucleotide sequence of the
polynucleotide may be determined. Typically, pairing of
the matching ends and ordering of the fragments into a
contiguous over-lapping arrangement may be carried out by
using a computer program designed for such an application.

Reading of said nucleotide sequence from said
contiguous arrangement may then also be carried out by or
with the assistance of a computer.
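
A minimal sketch of such a program is given below. It is not the program contemplated by the text, only one possible illustration under simplifying assumptions that are not in the source: every fragment sequence is already oriented the same way, the overhang length `k` is known, and no two unrelated ends coincide. The names `order_fragments` and `assemble` and the example reads are hypothetical.

```python
def order_fragments(fragments, k):
    """Order fragments so that the last k bases of each one equal the first
    k bases of the next. Raises ValueError if the ends are ambiguous."""
    by_prefix = {}
    for frag in fragments:
        if frag[:k] in by_prefix:
            raise ValueError("two fragments start with the same end sequence")
        by_prefix[frag[:k]] = frag
    suffixes = {frag[-k:] for frag in fragments}
    # The leftmost fragment is the one whose start matches no other fragment's end.
    starts = [frag for frag in fragments if frag[:k] not in suffixes]
    if len(starts) != 1:
        raise ValueError("cannot identify a unique first fragment")
    ordered = [starts[0]]
    while len(ordered) < len(fragments):
        nxt = by_prefix.get(ordered[-1][-k:])
        if nxt is None or nxt in ordered:
            raise ValueError("chain of matching ends is broken or circular")
        ordered.append(nxt)
    return ordered

def assemble(ordered, k):
    """Read the contiguous sequence, counting each shared end only once."""
    return ordered[0] + "".join(frag[k:] for frag in ordered[1:])

# Hypothetical reads from three clones, given in arbitrary order.
reads = ["ACGTACCCCGGTTA", "AAAAGACGCTTTTTACGTA", "GGTTATTTT"]
ordered = order_fragments(reads, k=5)
print(assemble(ordered, k=5))  # AAAAGACGCTTTTTACGTACCCCGGTTATTTT
```

A practical program would also have to try the reverse complement of each read, since a cloned insert can be sequenced in either orientation, and would need a strategy for ends that match by chance.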

It may be appreciated that the method described herein
may be used in conjunction with manual, semi-automated or
fully automated sequencing apparatus known in the art.

In manual sequencing the scientist typically reads the
sequence off an autoradiograph taken from a gel, on which
radioactive or chemiluminescent DNA fragments have been
separated according to size by electrophoresis. Such
techniques are well known in the art and are described for
example in Sambrook, J et al (1989) Molecular Cloning: a
laboratory manual, Cold Spring Harbor Laboratory, Cold
Spring Harbor, N.Y. The sequence is then conveniently
entered into a computer to facilitate observation and/or
manipulation of the sequence using appropriate computer
software. However, manual sequencing is being circumvented
by semi-automated or fully automated sequencing apparatus
which can not only determine the sequence of a particular
polynucleotide, but can input this information directly
into a computer comprising appropriate sequence handling
computer software.

It is therefore immediately evident that a computer
program designed for pairing of the matching ends and
ordering of the fragments into a contiguous over-lapping
arrangement may be provided which is suitable for use with
the method of the present invention when using manual,
semi-automated, and/or fully automated sequencing
apparatus. For example it may be possible to provide
suitable software for use in conjunction with a
semi-automated or fully automated sequencing apparatus such that
the fragments generated using the method of the present
invention may be sequenced using a single apparatus linked
to a computer comprising the computer software. The
sequences of the various fragments are determined using the
sequencing apparatus, and the software is able to pair the
matching ends and order the fragments into a contiguous
over-lapping arrangement. Thereafter the software is able
to determine the sequence of said polynucleotide from said
contiguous arrangement and provide the user with a single
nucleotide sequence corresponding to the original
polynucleotide.



Thus in a further aspect the present invention
provides a computer program for use with the method as
described herein, wherein the computer program serves to
pair the matching ends of the sequenced fragments and order
the fragments into a contiguous overlapping arrangement;
thereafter the computer program may read from the
contiguous overlapping arrangement and provide the user
with the nucleotide sequence of the original
polynucleotide. Such a computer program may be provided to
a user of the present invention on a computer readable
medium such as a floppy disk, CD-ROM or the like.
Alternatively semi-automated or fully automated sequencing
apparatus with a dedicated computer may be provided with
the computer program preloaded into the computer's memory.

In order to help better understand the process of
pairing the matching ends and ordering the sequences,
reference is made to Figure 1 which shows the process
schematically.

Part A of Figure 1 shows five fragments (1 to 5) which
were generated from a single polynucleotide fragment which
had been cleaved with a restriction enzyme as defined
above. The fragments have been blunt-ended by filling-in
as described, cloned and sequenced. The small regions of
sequence corresponding to the 5'-overlaps generated by the
restriction enzyme are shown as different symbols. To a
high degree of certainty only the ends generated by a
particular cleavage reaction will be the same. Thus, for
example, the right hand end of fragment 5 matches the left
hand end of fragment 3.

By pairing the matching ends of the fragments it is
possible to order the fragments in a contiguous overlapping
linear arrangement as represented in part B of Figure 1.
Once the fragments are ordered as shown in part B, the
nucleotide sequence of the original polynucleotide can be
easily determined (as shown in part C of Figure 1).

In the example as represented by Figure 1 only two
individual ends match with one another. When only a few
fragments are generated the likelihood of more than two
ends matching is remote. Indeed Table 1 shows the
estimated average length of DNA that would be expected
before identical restriction sites for each particular
restriction enzyme would be observed.

Where there are random overlaps, more than one
contiguous arrangement permutation is possible. However,
most permutations can be discounted immediately, for
example, permutations that produce a circular contiguous
arrangement for a DNA fragment that is linear.
Additionally the polynucleotide could be cleaved using a
different restriction enzyme or a partial digest performed
in order to assist in ordering the fragments.



The present process may also be used to conduct
restriction mapping of a polynucleotide. To achieve this,
it is not necessary to sequence the entire length of each
fragment; only the blunt-ends generated from the
restriction enzyme digestion and filling-in reaction need
be sequenced. It is then possible to order the fragments
as described above in order to generate a restriction map.

In another aspect the present invention provides a kit
suitable for use in any of said processes according to the
present invention, the kit comprising at least one
restriction enzyme as defined herein together with a DNA
polymerase or polymerases for the filling-in and/or
sequencing reactions. Other components such as dNTPs, a
T-tailed vector, competent cells, sequencing reagents and the
like may also be included as appropriate. In addition a
computer program in a machine readable form such as a
computer disk or CD-ROM may be provided for pairing the
matching ends and ordering the fragments into a contiguous
overlapping arrangement and thereafter providing the
nucleotide sequence of the polynucleotide.

The present invention will be further described and
understood with reference to the following non-limiting
Examples section.



Examples Section

Materials & Methods

1. Restriction Enzyme Digests

All restriction enzyme digests were performed on pure
DNA using restriction enzymes supplied by Promega (Promega,
Southampton, UK) or New England Biolabs (New England
Biolabs, Hitchin, UK). Incubation conditions were 37°C for
a minimum of 1 hour using the appropriate buffer supplied
by Promega or New England Biolabs. Following digestion DNA
was run on an agarose gel and gel extracted using the QIAEX II
gel extraction kit (Qiagen, Crawley, UK).

2. Extraction of DNA from agarose gels with the QIAEX II
gel extraction kit

All DNA extracted from gels was purified using the
QIAEX II DNA gel extraction kit according to the
manufacturer's instructions. Briefly, three volumes of
'QX-1' buffer and 10µl of QIAEX II DNA binding beads were
added to each gel plug. The plugs were dissolved by
warming to 50°C during which time the beads were kept
suspended by vortexing every 2 min. After 10 min the beads
were pelleted by a 20s centrifugation in a benchtop
centrifuge. The supernatant was removed and the pellet
washed in 500µl 'QX-1' buffer, resuspended, and then
pelleted in the same manner as above. The pellet was then
washed, resuspended and pelleted similarly in an ethanolic
wash 'PE' buffer. The pellet was then allowed to dry for
10 min and then eluted in 20µl of water. This DNA was
typically contaminated with ethanol and so was subsequently
purified by ethanol precipitation.

3. Ethanol precipitation

To the volume of DNA to be ethanol precipitated, 0.1
volume 3M sodium acetate was added and 2 volumes of 100%
ethanol. The vial was mixed and incubated at -80°C for 30
minutes. The precipitated suspension was centrifuged at
11000rpm in a Jouan (MR1812) refrigerated centrifuge for 10
min to pellet the DNA. The supernatant was aspirated and
1ml of 70% ethanol added. The DNA was pelleted again by
centrifugation at 11000rpm in the refrigerated centrifuge
for 5 min, the supernatant aspirated and the pellet allowed
to air-dry for 5-10 minutes. The DNA was resuspended in TE
buffer and the purity of the DNA checked by UV absorption
at 260nm and 280nm, where A260/A280 = 1.8 for pure plasmid
DNA.

4. Generation of plasmid DNA

Plasmid DNA was prepared using maxiprep and miniprep
kits (Promega, Southampton, UK). A brief protocol for a
Promega maxiprep kit is given below.

Promega maxi-prep

A culture was set up by stabbing a toothpick into
frozen glycerol stocks and adding it to 400ml of ampicillin
(50µg/ml) LB medium. The culture was incubated overnight
at 37°C in a rotating incubator at 200rpm.

Preparation of cleared lysate

The culture was then poured equally into 250ml Beckman
centrifuge tubes and pelleted at 9500g for 10 mins at room
temperature in a JA-14 rotor. Each pellet was resuspended
in 7.5ml 'Resuspension solution' using a heat-sealed 5ml
pipette to manually disrupt the pellet. These suspensions
were combined. To the combined 15ml, 15ml 'Cell Lysis'
solution was added and mixed by inversion. Lysis was
allowed to complete (up to 20 min) and then 15ml of
'Neutralisation solution' was added and immediately mixed
by inversion.

The suspension was centrifuged at 14,000g for 15 min
at room temperature. The cleared supernatant was
transferred to a new container.

Plasmid DNA precipitation

0.6 volumes of isopropanol was now added and mixed by
inverting several times. The DNA was pelleted by
centrifugation at 14,000g for 15 mins at room temperature.
The supernatant was discarded and the DNA pellet
resuspended in 2ml TE.

Plasmid purification

One Maxicolumn was inserted into a vacuum manifold.
10ml of well-shaken pre-warmed 'DNA purification resin' was
added to the DNA/TE solution and then this slurry was added
to the Maxicolumn. A vacuum was applied to draw the slurry
through. The DNA/resin container was rinsed with 13ml of
'Column wash solution' and immediately added to the column
under vacuum. A final wash of 12ml of 'Column wash
solution' was then added to the column. The resin was
rinsed with 5ml of 80% isopropanol under vacuum.

The resin was dried by centrifuging the column in its
50ml conical tube in a bench-top clinical centrifuge at
2,500 rpm (1300 g) for 5 min. It was then transferred to
a new 50ml conical centrifuge tube. 1.5 ml pre-heated
water (65-70°C) was applied to the tube. After 1 minute
this water was centrifuged out of the column using the
conditions above.

DNA solution was stored at -20°C.



5. Cloning DNA

Phosphatase treatment of DNA

If appropriate (i.e. not necessary for TA-cloning
vectors), prior to ligation of an insert into a vector, the
plasmid DNA was treated with calf intestinal alkaline
phosphatase (CIAP) if the vector had been digested with a
single restriction enzyme. The CIAP removes the 5'
phosphate groups and thus prevents recircularisation of the
vector during ligation.

Reaction mix

The following was added to a microcentrifuge tube:

vector DNA
CIAP 10X buffer
CIAP
dH2O

This was mixed gently and incubated for 1 hour at 37°C.
CIAP was removed prior to ligation by phenol/chloroform
extraction.

Double stranded DNA ligation

Double stranded DNA with cohesive ends was ligated
into 100ng vector by adding 1 unit of T4 DNA ligase
(Promega) to 1:1 and 1:3 ratios of vector and insert DNA in
19.5µl 1X ligase buffer (10X T4 DNA ligase buffer is 300mM
Tris-HCl, pH7.8, 100mM MgCl2, 100mM DTT, 10mM ATP). This
reaction was incubated at 14°C overnight. The ligase buffer
was aliquoted to prevent degradation of ATP.

6. TA Cloning

Sample preparation

The DNA precipitate from an ethanol precipitation was
resuspended in a volume such that the ratio of
concentration of the average sized insert to vector would
be 3:1 in the ligation.
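
If the 3:1 ratio is read as a molar ratio, the amount of insert needed can be estimated from the fragment lengths, since the mass of a DNA molecule is proportional to its length. The helper below is a hypothetical illustration of that standard conversion; the function name and the vector and insert sizes used in the example are placeholders, not values taken from this protocol.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Equal masses of DNA contain molecule numbers inversely proportional to
    length, so the vector mass is scaled by insert_bp / vector_bp.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio

# Placeholder example: 100 ng of a 3900 bp vector, 650 bp average insert.
print(round(insert_mass_ng(100, 3900, 650), 1))  # 50.0
```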

Ligation

TA cloning depends on a property which certain
polymerases possess in transferring single adenines onto
the 3' end of blunt-ended DNAs. Vectors carrying
complementary T overhangs can ligate with these DNAs very
efficiently because neither molecule can circularise, thus
promoting intermolecular reactions. TA cloning was
performed using the Original TA Cloning® kit or the
Eukaryotic TA Cloning kit (both available from Invitrogen
BV, NV Leek, Holland) (Bidirectional) as required. The
ligation reaction is carried out essentially as above,
using the supplied precut vector containing the T overhang.

TA cloning: transformation

An aliquot of frozen competent cells (either INVαF' or
TOP10F', supplied by Invitrogen) was thawed slowly on ice.
2µl of ligation reaction and 2µl of 0.5M β-mercaptoethanol
were added to the tube, mixed with the pipette tip and
incubated on ice for 30 min. The cells were then heat
shocked for 30s at 42°C and incubated on ice for a further
2 min. 250µl of SOC broth was then added and the
transformed cells incubated at 37°C for 60 min with shaking
(225rpm). 100µl of the culture was plated on a 10cm agar
plate containing 50µg/ml ampicillin. Transformed colonies
were identifiable in the 'Original' TA cloning vector
(pCR2.1) using blue/white colour selection because of
insertion into the β-galactosidase gene. Colour selection
was not possible in the eukaryotic TA cloning expression
vector (pCR3.1). In either case white colonies were
picked, PCR screened to ensure an insert was present and
glycerol stocks made of positive colonies.



7. DNA sequencing with the ABI Sequencer

Protocol for cycle sequencing

Samples for sequencing taken from maxi-preps or
mini-preps were mixed with the Taq DyeDeoxy Terminator (Applied
Biosystems, Foster City, CA, USA) reaction premix.

Reaction mix:

Reaction premix (contains buffer,
polymerase, dNTPs, ddNTPs, magnesium)    [volume illegible]
ds DNA template                          400ng
Primer (for ds DNA)                      3.2pmol
H2O                                      to 20µl

Sequencing reactions so prepared were subjected to
thermal cycling using the following conditions:

Cycles (25)    Denaturation    96°C for 30s
               Annealing*      47°C for 15s
               Extension       60°C for 4min

* This segment temperature was variable according to the
primer used. The temperature shown was that used for the
T7 sequencing primer (taatacgactcactataggg) and the pCR2.1
upstream primer (agctatgaccatgattacg).

Reaction products were concentrated by ethanol
precipitation and the pellets sent to the Glasgow
University Molecular Biology Support Unit for gel
electrophoresis and sequence determination using an Applied
Biosystems fluorescent sequencer model 393A.

Sequence analysis

Routine DNA sequence handling and analysis was
performed on the Gene Jockey II program (Biosoft,
Cambridge, UK).

Generation of HgaI digested fragments

A 2.4kb XhoII fragment which had been cloned into a
plasmid vector was re-excised using flanking EcoRI sites
and the resulting fragment was digested, as described
above, with an appropriate or 1 unit excess amount of HgaI
restriction endonuclease.

Digestion generated four separate fragments, two of
0.4kb, one of 0.7kb and one of 0.9kb. The fragments were
separated by gel electrophoresis and purified from
appropriate gel fragments, as described above.

The recessed 3'-ends generated by HgaI and XhoII
digestion of the purified fragments were filled in using
Taq polymerase and dNTPs as outlined below.

Filling-in reaction as follows:

Gel purified DNA (dissolved in water)          to final volume 50µl
Taq polymerase 10X reaction buffer (Promega)   5µl
dNTPs (2mM of each dNTP)                       5µl
25mM MgCl2                                     [volume illegible]
Taq polymerase (Promega)                       2.5 units

Incubated at 65°C for 10 min.

A single TA ligation reaction was set up with the 
filled-in fragments and "original" TA vector as described 
above. 

After ligation was completed INVαF' competent cells
were transformed and plated on agar plates, containing
IPTG/Xgal for blue/white colour selection, and colonies
allowed to develop.

Plasmid DNA was prepared from a selection of colonies 
and screened using PCR techniques in order to ensure insert 
was present in the vector. 

Insert containing plasmid DNA was then prepared for 
DNA sequencing. 

Sequencing the cloned fragments and ordering the sequences
into a contiguous arrangement

Sequencing of the plasmid DNA was carried out as
described above and the sequence obtained subjected to
sequence analysis.

Figure 2 shows the short unique overlaps which were
introduced into the polynucleotide fragment following
digestion with HgaI. The solid underlines show the HgaI
recognition sequences and the bold GATC motifs at the ends
of fragments 1 and 4 are the ends of the XhoII fragment
following TA cloning (recognition sequence RGATCY where R =
G/A and Y = C/T).

The entire sequence of the fragments is not shown
since it is not necessary for the understanding of the
underlying principle of the present invention. These short
unique overlaps allowed the two 0.4kb, one 0.7kb and one
0.9kb fragments to be ordered into a contiguous overlapping
arrangement as shown in Figure 3a. Figure 3b shows the
HgaI restriction map of the original polynucleotide
fragment.

As can be seen from Figure 2 and Figure 3b, the
junction between fragments 3 and 4 was more complex than
the other junctions, due to two HgaI restriction sites
being extremely close to each other. However, in such a
situation, digestion at one site effectively destroys the
other such that an overlap can still be discerned from some
clones.

TABLE 1

Restriction enzyme        Recognition       Frequency     Probability of    Length of DNA
                          sequence          of cutting    chance overlap    before chance
                                            (bp)          match (a)         match (b)

BbvI                      GCAGC(N)8/12      1 in 512      1 in 256          131Kb
BsmFI                     GTCCC(N)10/14
Bst71I                    GCAGC(N)8/12
FokI                      GGATG(N)9/13
[illegible]               GCATC(N)5/9

HgaI                      GACGC(N)5/10      1 in 512      1 in 1024         524Kb

Eam1104I/EarI/Ksp632I     CTCTTC(N)1/4      1 in 2048     1 in 64           131Kb

BbsI/Bbv16II/BpiI/BpuAI   GAAGAC(N)2/6      1 in 2048     1 in 256          524Kb
BsaI/Eco31I               GGTCTC(N)1/5
BsmBI/Esp3I               CGTCTC(N)1/5
BspMI                     ACCTGC(N)4/8
GdiII*                    CGGCCR(N)1/5

SapI                      GCTCTTC(N)1/4     1 in 8192     1 in 64           524Kb


(a) Probability of chance match between two unrelated
overhangs.

(b) Estimated length of DNA before a random match between
unrelated recognition sequences. (Example calculation:
Alw26I will cut on average once every 512bp. Each
overhang has a 1 in 256 chance of matching. Thus
the estimated length of DNA before two identical
recognition sequences are observed is:
the frequency of cutting x the chances of matching,
i.e. 512bp x 256 = 131Kb.)
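
The example calculation can be reproduced for other entries with a few lines of Python. This is a hedged sketch with hypothetical function names. In particular, halving 4^n to obtain the frequency of cutting is an inference from the tabulated values (for example 4^5 / 2 = 512 for a five-base site, consistent with a non-palindromic site being found on either strand); the source does not state this rule, and degenerate sites such as CGGCCR are not handled.

```python
def cut_interval_bp(site_length):
    """Average spacing (bp) between sites of `site_length` fully specified
    bases, assuming random sequence and a site readable on either strand.
    The division by two is inferred from the table, not stated in the text."""
    return 4 ** site_length // 2

def overhang_match_odds(near_cut, far_cut):
    """The '1 in N' odds that two unrelated overhangs coincide."""
    return 4 ** (far_cut - near_cut)

def dna_before_chance_match_bp(site_length, near_cut, far_cut):
    """Frequency of cutting multiplied by the odds of a chance match."""
    return cut_interval_bp(site_length) * overhang_match_odds(near_cut, far_cut)

print(dna_before_chance_match_bp(5, 1, 5))   # GTCTC(N)1/5:   131072 bp, about 131Kb
print(dna_before_chance_match_bp(5, 5, 10))  # GACGC(N)5/10:  524288 bp, about 524Kb
print(dna_before_chance_match_bp(7, 1, 4))   # GCTCTTC(N)1/4: 524288 bp, about 524Kb
```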



